Comparing time-frequency representations for directional derivative features

نویسندگان

  • James Gibson
  • Maarten Van Segbroeck
  • Shrikanth S. Narayanan
چکیده

We compare the performance of Directional Derivatives features for automatic speech recognition when extracted from different time-frequency representations. Specifically, we use the short-time Fourier transform, Mel-frequency, and Gammatone spectrograms as a base from which we extract spectrotemporal modulations. We then assess the noise robustness of each representation with varied number of frequency bins and dynamic range compression schemes for both word and phone recognition. We find that the choice of dynamic range compression approach has the most significant impact on recognition performance. Whereas, the performance differences between perceptually motivated filter-banks are minimal in the proposed framework. Furthermore, this work presents significant gains in speech recognition accuracy for low SNRs over MFCCs, GFCCs, and Directional Derivatives extracted from the log-Mel spectrogram.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Local Derivative Pattern with Smart Thresholding: Local Composition Derivative Pattern for Palmprint Matching

Palmprint recognition is a new biometrics system based on physiological characteristics of the palmprint, which includes rich, stable, and unique features such as lines, points, and texture. Texture is one of the most important features extracted from low resolution images. In this paper, a new local descriptor, Local Composition Derivative Pattern (LCDP) is proposed to extract smartly stronger...

متن کامل

Image Representations Using Multiscale Diierential Operators

Diierential operators have been widely used for multiscale geometric descriptions of images. The eecient computation of these diierential operators is always desirable. Moreover, it has not been clear whether such representations are invertible. For certain applications, it is usually required that such representations should be invertible so that one can facilitate the processing of informatio...

متن کامل

EMG-based wrist gesture recognition using a convolutional neural network

Background: Deep learning has revolutionized artificial intelligence and has transformed many fields. It allows processing high-dimensional data (such as signals or images) without the need for feature engineering. The aim of this research is to develop a deep learning-based system to decode motor intent from electromyogram (EMG) signals. Methods: A myoelectric system based on convolutional ne...

متن کامل

A Novel Intelligent Fault Diagnosis Approach for Critical Rotating Machinery in the Time-frequency Domain

The rotating machinery is a common class of machinery in the industry. The root cause of faults in the rotating machinery is often faulty rolling element bearings. This paper presents a novel technique using artificial neural network learning for automated diagnosis of localized faults in rolling element bearings. The inputs of this technique are a number of features (harmmean and median), whic...

متن کامل

Free Axisymmetric Bending Vibration Analysis of two Directional FGM Circular Nano-plate on the Elastic Foundation

In the following paper, free vibration analysis of two directional FGM circular nano-plate on the elastic medium is investigated. The elastic modulus of plate varies in both radial and thickness directions. Eringen’s theory was employed to the analysis of circular nano-plate with variation in material properties. Simultaneous variations of the material properties in the radial and transverse di...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014